Unpacking Large Language Models with Conceptual Consistency
If a Large Language Model (LLM) answers "yes" to the question "Are mountains
tall?" then does it know what a mountain is? Can you rely on it responding
correctly or incorrectly to other questions about mountains? The success of
Large Language Models (LLMs) indicates they are increasingly able to answer
queries like these accurately, but that ability does not necessarily imply a
general understanding of concepts relevant to the anchor query. We propose
conceptual consistency to measure an LLM's understanding of relevant concepts.
This novel metric measures how well a model can be characterized by finding out
how consistent its responses to queries about conceptually relevant background
knowledge are. To compute it we extract background knowledge by traversing
paths between concepts in a knowledge base and then try to predict the model's
response to the anchor query from the background knowledge. We investigate the
performance of current LLMs in a commonsense reasoning setting using the CSQA
dataset and the ConceptNet knowledge base. While conceptual consistency, like
other metrics, does increase with the scale of the LLM used, we find that
popular models do not necessarily have high conceptual consistency. Our
analysis also shows significant variation in conceptual consistency across
different kinds of relations, concepts, and prompts. This serves as a step
toward building models that humans can apply a theory of mind to, and thus
interact with intuitively.
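As a rough illustration of the idea of predicting the anchor response from background knowledge, the sketch below uses a deliberately simple stand-in predictor: it guesses that the model answers the anchor query correctly whenever it answered at least a threshold fraction of the related background queries correctly, and then reports how often that guess matches. The function names, the thresholding rule, and the data layout are assumptions made for this example; the abstract does not specify the predictor or the exact scoring used in the paper.

```python
def background_score(background_correct):
    """Fraction of background-knowledge queries the model answered correctly."""
    return sum(background_correct) / len(background_correct)


def toy_consistency(examples, threshold=0.5):
    """Share of anchor outcomes matched by a threshold rule on the background score.

    Each example is a pair (background_correct, anchor_correct), where
    background_correct is a list of booleans and anchor_correct is a boolean.
    The threshold rule is a hypothetical stand-in for a learned predictor.
    """
    hits = 0
    for background_correct, anchor_correct in examples:
        predicted_correct = background_score(background_correct) >= threshold
        hits += int(predicted_correct == anchor_correct)
    return hits / len(examples)


# Made-up records: (answers to background queries, answer to anchor query).
examples = [
    ([True, True, False], True),    # strong background, anchor correct
    ([False, False], False),        # weak background, anchor wrong
    ([True, False, False], True),   # weak background, anchor still correct
]
score = toy_consistency(examples)
```

A high score means the anchor outcome is predictable from the background outcomes; the third record shows the kind of case that lowers it, where the model gets the anchor right without the supporting knowledge.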
Learning Compositional Visual Concepts with Mutual Consistency
Compositionality of semantic concepts in image synthesis and analysis is
appealing as it can help in decomposing known and generatively recomposing
unknown data. For instance, we may learn concepts of changing illumination,
geometry or albedo of a scene, and try to recombine them to generate physically
meaningful, but unseen data for training and testing. In practice, however, we
often do not have samples from the joint concept space available: we may have
data on illumination change in one data set and on geometric change in another
one without complete overlap. We pose the following question: How can we learn
two or more concepts jointly from different data sets with mutual consistency
where we do not have samples from the full joint space? We present a novel
answer in this paper based on cyclic consistency over multiple concepts,
represented individually by generative adversarial networks (GANs). Our method,
ConceptGAN, can be understood as a drop-in for data augmentation to improve
resilience for real-world applications. Qualitative and quantitative
evaluations demonstrate its efficacy in generating semantically meaningful
images, as well as one shot face verification as an example application.
Comment: 10 pages, 8 figures, 4 tables, CVPR 201
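The cyclic consistency mentioned in the abstract can be illustrated with the generic cycle-consistency loss for a single pair of mappings: translating a sample to the other domain and back should return the original sample. The sketch below is a toy version on plain lists of numbers, assuming an L1 reconstruction penalty; the function names and the stand-in mappings are hypothetical, and this is not the ConceptGAN objective, which the abstract describes as extending the idea over multiple concepts.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(samples, forward, backward):
    """Average L1 error between each sample and its round trip backward(forward(x))."""
    total = 0.0
    for sample in samples:
        total += l1_distance(backward(forward(sample)), sample)
    return total / len(samples)


# Toy stand-ins for two generators: a scaling and its (im)perfect inverse.
def forward(v):
    return [2.0 * x for x in v]


def exact_backward(v):
    return [x / 2.0 for x in v]


def biased_backward(v):
    return [x / 2.0 + 1.0 for x in v]


samples = [[1.0, 2.0], [3.0, 4.0]]
loss_exact = cycle_consistency_loss(samples, forward, exact_backward)
loss_biased = cycle_consistency_loss(samples, forward, biased_backward)
```

The loss is zero when the second mapping exactly inverts the first and grows with the round-trip error, which is what makes it usable as a training signal when paired samples are unavailable.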
Confidence Calibration for Systems with Cascaded Predictive Modules
Existing conformal prediction algorithms estimate prediction intervals at
target confidence levels to characterize the performance of a regression model
on new test samples. However, considering an autonomous system consisting of
multiple modules, prediction intervals constructed for individual modules fall
short of accommodating uncertainty propagation over different modules and thus
cannot provide reliable predictions on system behavior. We address this
limitation and present novel solutions based on conformal prediction to provide
prediction intervals calibrated for a predictive system consisting of cascaded
modules (e.g., an upstream feature extraction module and a downstream
regression module). Our key idea is to leverage module-level validation data to
characterize the system-level error distribution without direct access to
end-to-end validation data. We provide theoretical justification and empirical
results to demonstrate the effectiveness of the proposed solutions. In
comparison to prediction intervals calibrated for individual modules, our
solutions generate improved intervals with more accurate performance guarantees
for system predictions, which are demonstrated on both synthetic systems and
real-world systems performing overlap prediction for indoor navigation using
the Matterport3D dataset.
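For context, the single-module baseline that the abstract contrasts against can be sketched with standard split conformal prediction: absolute residuals on held-out validation data give a quantile that is added to and subtracted from a new point prediction. This is a minimal sketch of that generic technique only, not the cascaded method proposed in the paper; the function names and the example numbers are assumptions made for illustration.

```python
import math


def conformal_quantile(abs_residuals, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest absolute residual.

    If that rank exceeds n, there are too few validation points for the
    requested confidence level and the interval is unbounded.
    """
    n = len(abs_residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(abs_residuals)[rank - 1]


def prediction_interval(point_prediction, abs_residuals, alpha=0.1):
    """Symmetric interval around a point prediction at target level 1 - alpha."""
    q = conformal_quantile(abs_residuals, alpha)
    return point_prediction - q, point_prediction + q


# Made-up absolute validation residuals |y - y_hat| for one regression module.
validation_residuals = [5, 1, 9, 3, 7, 2, 8, 4, 6]
lower, upper = prediction_interval(10, validation_residuals, alpha=0.2)
```

Applied to one module in isolation, such an interval says nothing about how upstream errors propagate downstream, which is the gap the abstract addresses.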
Ultrasmall inorganic cages directed by surfactant micelles
Functional silica nanoparticles have become highly relevant materials in the fields of biology and medicine. Ultrasmall fluorescent silica nanoparticles developed in our group (Cdots) have now reached phase 2 of clinical trials for cancer diagnostics. Nevertheless, increasingly complex modern nanomedicine techniques still demand more efficient and multifunctional tools for advanced applications such as theranostics. To this end, important developments have been made in order for these nanoparticles to achieve their full potential, including chemical modification of their matrix to improve their optical properties, and new synthetic strategies for multifunctional nanoparticles via a surface modification approach with various functional groups. In parallel, new alternative particle geometries have been investigated for targeted drug delivery applications.
In this contribution, we will review some of the recent progress made in our group that ultimately led to the discovery of highly symmetrical dodecahedral silica nanocages, or ‘silicages’ [1]. Ultrasmall (< 10 nm) silica nanoparticles with tunable geometries can be obtained through their templating with surfactant micelles. The self-assembly of silica clusters on these micelles gives rise to unique and well-defined structures. The dodecahedral cage structure in particular is of great fundamental importance. It is the simplest of a set of Voronoi polyhedra suggested to form the smallest structural units of multiple forms of mesoporous silica, yet such highly symmetrical silica cages had never been isolated before. In order to resolve the actual structure of these ultrasmall objects, single-particle 3D reconstruction from tens of thousands of cryo-electron microscopy images was performed using a custom-built ‘Hetero’ machine learning algorithm. We will finally show that cage formation is not limited to silica, but has been observed for other materials including metals and transition metal oxides.
The chemical and practical value of this polyhedral structure may prove immense. Given the versatility of silica surface chemistry one can readily conceive of cage derivatives of many kinds, which may exhibit unusual properties and be useful in applications ranging from catalysis to drug delivery. For example, given recent success in the clinical translation of ultrasmall fluorescent silica nanoparticles with similar particle sizes and surface properties to these cages, one can envisage a range of new diagnostic and therapeutic probes with drugs hidden inside the cages.
Reference:
[1] K. Ma, Y. Gong, T. Aubert, M. Z. Turker, T. Kao, P. C. Doerschuk, U. Wiesner, Nature 2018, DOI: 10.1038/s41586-018-0221-0
Self-assembly of highly symmetrical, ultrasmall inorganic cages directed by surfactant micelles
Nanometre-sized objects with highly symmetrical, cage-like polyhedral shapes, often with icosahedral symmetry, have recently been assembled from DNA(1-3), RNA(4) or proteins(5,6) for applications in biology and medicine. These achievements relied on advances in the development of programmable self-assembling biological materials(7-10), and on rapidly developing techniques for generating three-dimensional (3D) reconstructions from cryo-electron microscopy images of single particles, which provide high-resolution structural characterization of biological complexes(11-13). Such single-particle 3D reconstruction approaches have not yet been successfully applied to the identification of synthetic inorganic nanomaterials with highly symmetrical cage-like shapes. Here, however, using a combination of cryo-electron microscopy and single-particle 3D reconstruction, we suggest the existence of isolated ultrasmall (less than 10 nm) silica cages ('silicages') with dodecahedral structure. We propose that such highly symmetrical, self-assembled cages form through the arrangement of primary silica clusters in aqueous solutions on the surface of oppositely charged surfactant micelles. This discovery paves the way for nanoscale cages made from silica and other inorganic materials to be used as building blocks for a wide range of advanced functional-materials applications.
COMPUTATIONAL IMAGE UNDERSTANDING INCORPORATING PHYSICS-BASED MODELING AND EMPIRICAL LEARNING FOR REAL-WORLD APPLICATIONS
Challenging interdisciplinary applications inspire new methodological developments in data understanding. Two somewhat disjoint communities provide current solutions to data understanding. Statistical inference approaches based on abstract models allow incorporation of physics priors and parametric uncertainty. But to provide accurate models for complicated real-world data, one is often challenged by the curse of dimensionality. Alternatively, machine learning, especially the deep learning community, provides empirical descriptions of large complicated datasets. However, little prior knowledge is incorporated in current design of deep neural networks and such methods are often challenged by problems including data scarcity and limited transferability of the models. This dissertation includes methodological development in image understanding from each of the two perspectives: (1) Using statistical inference based on analytical models, 3-D spatial structure and temporal dynamics of nanoscale particles were reconstructed directly from large sets of cryo electron microscopy data. With a statistical framework incorporating the continuous heterogeneity among the imaged particles, a generative mechanical model was developed to provide sparse and analytical parametrization of the stochastic description of particle structure. This work contributes a systematic way to incorporate a fourth (temporal) dimension to the concept of 3D reconstruction. (2) Via deep neural networks-based machine learning approaches, the problem of concept learning in computer vision was investigated. Motivated by the challenge of data scarcity, a deep generative model-based framework, ConceptGAN, was developed to decompose data into transferable and composable semantic concepts and generatively recompose physically meaningful but unseen data, without complete training data over the joint latent space. 
It contributes a smart data augmentation technique which provides informative augmentation to improve the resilience of real-world applications. Finally, this dissertation concludes with a discussion on potential future research directions, in particular, on how methodological ideas from both perspectives, physics-based modeling and deep learning, can be fused to provide hybrid solutions that incorporate the strengths of both components, especially targeting real-world challenges including resilience, robustness, transferability and interpretability of the solutions.